Learning transformed product distributions
Authors
Abstract
We consider the problem of learning an unknown product distribution X over {0,1}^n using samples f(X), where f is a known transformation function. Each choice of a transformation function f specifies a learning problem in this framework. Information-theoretic arguments show that for every transformation function f the corresponding learning problem can be solved to accuracy ε, using Õ(n/ε²) examples, by a generic algorithm whose running time may be exponential in n. We show that this learning problem can be computationally intractable even for constant ε and rather simple transformation functions. Moreover, the above sample complexity bound is nearly optimal for the general problem, as we give a simple explicit linear transformation function f(x) = w · x with integer weights w_i ≤ n and prove that the corresponding learning problem requires Ω(n) samples. As our main positive result, we give a highly efficient algorithm for learning a sum of independent unknown Bernoulli random variables, corresponding to the transformation function f(x) = x_1 + ... + x_n. Our algorithm learns to ε-accuracy in poly(n) time, using a surprising poly(1/ε) number of samples that is independent of n. We also give an efficient algorithm that uses log n · poly(1/ε) samples and has running time that is only poly(log n, 1/ε).
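The generic information-theoretic learner mentioned above can be made concrete: draw samples of f(X) and output their empirical histogram. The sketch below is illustrative only; the parameters p_i, sample counts, and helper names are our own choices, and this is the naive empirical learner, not the paper's efficient algorithm. It simulates the transformation f(x) = x_1 + ... + x_n (a sum of independent Bernoullis) and compares two independent empirical distributions in total variation distance.

```python
import random
from collections import Counter

def sample_sum(ps, rng):
    """One observation of f(X) = x_1 + ... + x_n, where each x_i ~ Bernoulli(p_i) independently."""
    return sum(rng.random() < p for p in ps)

def empirical_pmf(values):
    """Generic learner: the empirical histogram of the observed f(X) values."""
    m = len(values)
    return {v: c / m for v, c in Counter(values).items()}

def total_variation(p, q):
    """Total variation distance between two pmfs given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

rng = random.Random(0)
ps = [(i % 9 + 1) / 10 for i in range(20)]          # hypothetical parameters, unknown to the learner
train = [sample_sum(ps, rng) for _ in range(5000)]  # samples given to the learner
hold = [sample_sum(ps, rng) for _ in range(5000)]   # fresh samples for a sanity check
pmf = empirical_pmf(train)
dist = total_variation(pmf, empirical_pmf(hold))
```

With enough samples the empirical histogram converges to the true distribution of f(X); the point of the paper's main result is that, for the sum transformation, far fewer samples (poly(1/ε), independent of n) already suffice.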
Related papers
Minimax Estimation of the Scale Parameter in a Family of Transformed Chi-Square Distributions under Asymmetric Squared Log Error and MLINEX Loss Functions
This paper is concerned with the problem of finding the minimax estimators of the scale parameter in a family of transformed chi-square distributions, under asymmetric squared log error (SLE) and modified linear exponential (MLINEX) loss functions, using the Lehmann Theorem [2]. We also show that the results of Podder et al. [4] for the Pareto distribution are a special case of our results for th...
Learning Mixtures of Product Distributions Using Correlations and Independence
We study the problem of learning mixtures of distributions, a natural formalization of clustering. A mixture of distributions is a collection of distributions D = {D_1, ..., D_T} and mixing weights {w_1, ..., w_T} such that...
Learning Mixtures of Discrete Product Distributions using Spectral Decompositions
We study the problem of learning a distribution from samples, when the underlying distribution is a mixture of product distributions over discrete domains. This problem is motivated by several practical applications such as crowdsourcing, recommendation systems, and learning Boolean functions. The existing solutions either heavily rely on the fact that the number of mixtures is finite or have s...
Activized Learning: Transforming Passive to Active with Improved Label Complexity
We study the theoretical advantages of active learning over passive learning. Specifically, we prove that, in noise-free classifier learning for VC classes, any passive learning algorithm can be transformed into an active learning algorithm with asymptotically strictly superior label complexity for all nontrivial target functions and distributions. We further provide a general characterization ...
Activized Learning: Transforming Passive to Active with Improved Label Complexity — Working Notes: Updated January 2011
We study the theoretical advantages of active learning over passive learning. Specifically, we prove that, in noise-free classifier learning for VC classes, any passive learning algorithm can be transformed into an active learning algorithm with asymptotically strictly superior label complexity for all nontrivial target functions and distributions, in many cases without significant loss in comp...
Journal: CoRR
Volume: abs/1103.0598
Issue: –
Pages: –
Publication date: 2011